Support Vector Machines
Structural interpretability in SVMs with truncated orthogonal polynomial kernels
Soto-Larrosa, Víctor, Torrado, Nuria, Huertas, Edmundo J.
We study post-training interpretability for Support Vector Machines (SVMs) built from truncated orthogonal polynomial kernels. Since the associated reproducing kernel Hilbert space is finite-dimensional and admits an explicit tensor-product orthonormal basis, the fitted decision function can be expanded exactly in intrinsic RKHS coordinates. This leads to Orthogonal Representation Contribution Analysis (ORCA), a diagnostic framework based on normalized Orthogonal Kernel Contribution (OKC) indices. These indices quantify how the squared RKHS norm of the classifier is distributed across interaction orders, total polynomial degrees, marginal coordinate effects, and pairwise contributions. The methodology is fully post-training and requires neither surrogate models nor retraining. We illustrate its diagnostic value on a synthetic double-spiral problem and on a real five-dimensional echocardiogram dataset. The results show that the proposed indices reveal structural aspects of model complexity that are not captured by predictive accuracy alone.
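A minimal sketch of the kind of index the abstract describes, assuming a tensor-product Legendre basis in two dimensions (the paper covers general orthogonal polynomial families) and a pre-fitted SVM whose dual coefficients are available; `phi`, `okc_by_degree`, and the degree cap `d` are illustrative names, not the paper's API:

```python
import numpy as np
from itertools import product
from numpy.polynomial.legendre import legval

d = 3  # per-coordinate degree cap of the truncated kernel (illustrative)
norms = np.sqrt((2 * np.arange(d + 1) + 1) / 2)  # orthonormalize on [-1, 1]

def phi(x):
    # All tensor-product basis functions at one 2-D point x.
    v = [legval(x[k], np.eye(d + 1)) * norms for k in range(2)]
    return np.array([v[0][i] * v[1][j]
                     for i, j in product(range(d + 1), repeat=2)])

def okc_by_degree(X_sv, dual):
    # dual[i] = alpha_i * y_i for support vector X_sv[i]; with the truncated
    # kernel k(x, z) = phi(x) @ phi(z), the fitted decision function has
    # exact RKHS coordinates c = sum_i dual[i] * phi(X_sv[i]).
    c = sum(a * phi(x) for a, x in zip(dual, X_sv))
    share = c ** 2 / np.sum(c ** 2)           # normalized squared-norm shares
    okc = {}
    for (i, j), s in zip(product(range(d + 1), repeat=2), share):
        okc[i + j] = okc.get(i + j, 0.0) + s  # aggregate by total degree
    return okc
```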
Spectral methods: crucial for machine learning, natural for quantum computers?
Belis, Vasilis, Bowles, Joseph, Gupta, Rishabh, Peters, Evan, Schuld, Maria
This article presents an argument for why quantum computers could unlock new methods for machine learning. We argue that spectral methods, in particular those that learn, regularise, or otherwise manipulate the Fourier spectrum of a machine learning model, are often natural for quantum computers. For example, if a generative machine learning model is represented by a quantum state, the Quantum Fourier Transform allows us to manipulate the Fourier spectrum of the state using the entire toolbox of quantum routines, an operation that is usually prohibitive for classical models. At the same time, spectral methods are surprisingly fundamental to machine learning: a spectral bias has recently been hypothesised to be the core principle behind the success of deep learning; support vector machines have been known for decades to regularise in Fourier space; and convolutional neural nets build filters in the Fourier space of images. Could, then, quantum computing open fundamentally different, much more direct and resource-efficient ways to design the spectral properties of a model? We discuss this potential in detail here, hoping to stimulate a direction in quantum machine learning research that puts the question of "why quantum?" first.
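As a classical toy of the spectrum manipulation the argument centres on, the sketch below samples a one-dimensional model on a grid, takes its Fourier transform, and truncates the high frequencies; for an amplitude-encoded quantum state, the QFT would perform the transform step on the state itself. All names here are illustrative:

```python
import numpy as np

def lowpass_model(f, n_grid=256, keep=8):
    x = np.linspace(0, 1, n_grid, endpoint=False)
    spectrum = np.fft.rfft(f(x))      # the model's Fourier spectrum
    spectrum[keep:] = 0.0             # regularise: discard high frequencies
    return x, np.fft.irfft(spectrum, n=n_grid)

# A noisy target whose smooth component survives the low-pass filter.
x, smoothed = lowpass_model(lambda t: np.sin(2 * np.pi * t)
                                      + 0.3 * np.sin(40 * np.pi * t))
```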
Sparse $\varepsilon$-insensitive zone bounded asymmetric elastic net support vector machines for pattern classification
Existing support vector machine (SVM) models are sensitive to noise and lack sparsity, which limits their performance. To address these issues, we combine the elastic net loss with a robust loss framework to construct a sparse $\varepsilon$-insensitive bounded asymmetric elastic net loss, and integrate it with SVM to build the $\varepsilon$-Insensitive Zone Bounded Asymmetric Elastic Net Loss-based SVM ($\varepsilon$-BAEN-SVM). $\varepsilon$-BAEN-SVM is both sparse and robust: sparsity is proven by showing that samples inside the $\varepsilon$-insensitive band are not support vectors, and robustness is theoretically guaranteed because the influence function is bounded. To solve the resulting non-convex optimization problem, we design a half-quadratic algorithm based on clipping dual coordinate descent, which transforms the problem into a series of weighted subproblems and improves computational efficiency via the $\varepsilon$ parameter. Experiments on simulated and real datasets show that $\varepsilon$-BAEN-SVM outperforms traditional and existing robust SVMs, balancing sparsity and robustness well in noisy environments; statistical tests confirm its superiority. Under the Gaussian kernel, it achieves better accuracy and noise insensitivity, validating its effectiveness and practical value.
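To make the named ingredients concrete, here is one plausible shape for such a loss on a residual r (illustrative only, not the paper's exact definition): flat inside the $\varepsilon$-insensitive zone, asymmetric elastic-net (L1 plus L2) growth outside it, and a hard cap that bounds the influence function:

```python
import numpy as np

def baen_loss(r, eps=0.1, a_pos=1.0, a_neg=0.5, lam=0.5, bound=2.0):
    # Zero inside the eps band: such samples are not support vectors.
    u = np.maximum(np.abs(r) - eps, 0.0)
    a = np.where(r >= 0, a_pos, a_neg)      # asymmetric side weights
    # Elastic-net growth (L1 + L2), capped so the influence is bounded.
    return np.minimum(a * (u + lam * u ** 2), bound)
```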
CoNBONet: Conformalized Neuroscience-inspired Bayesian Operator Network for Reliability Analysis
Garg, Shailesh, Chakraborty, Souvik
Time-dependent reliability analysis of nonlinear dynamical systems under stochastic excitations is a critical yet computationally demanding task. Conventional approaches, such as Monte Carlo simulation, necessitate repeated evaluations of computationally expensive numerical solvers, leading to significant computational bottlenecks. To address this challenge, we propose \textit{CoNBONet}, a neuroscience-inspired surrogate model that enables fast, energy-efficient, and uncertainty-aware reliability analysis, providing a scalable alternative to techniques such as Monte Carlo simulations. CoNBONet, short for \textbf{Co}nformalized \textbf{N}euroscience-inspired \textbf{B}ayesian \textbf{O}perator \textbf{Net}work, leverages the expressive power of deep operator networks while integrating neuroscience-inspired neuron models to achieve fast, low-power inference. Unlike traditional surrogates such as Gaussian processes, polynomial chaos expansions, or support vector regression, which may face scalability challenges for high-dimensional, time-dependent reliability problems, CoNBONet offers \textit{fast and energy-efficient inference} enabled by a neuroscience-inspired network architecture, \textit{calibrated uncertainty quantification with theoretical guarantees} via split conformal prediction, and \textit{strong generalization capability} through an operator-learning paradigm that maps input functions to system response trajectories. Validation of the proposed CoNBONet for various nonlinear dynamical systems demonstrates that it preserves predictive fidelity and achieves reliable coverage of failure probabilities, making it a powerful tool for robust and scalable reliability analysis in engineering design.
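Of the components listed, the split conformal step is standard and easy to sketch generically; the version below assumes only a fitted surrogate's `predict` function and a held-out calibration set (all names are illustrative, not the paper's code):

```python
import numpy as np

def split_conformal(predict, X_cal, y_cal, X_test, alpha=0.1):
    # Calibration residuals from data the surrogate never trained on.
    scores = np.abs(y_cal - predict(X_cal))
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, level, method="higher")  # conformal quantile
    mu = predict(X_test)
    # Intervals with finite-sample marginal coverage >= 1 - alpha.
    return mu - q, mu + q
```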
Parametric Simplex Method for Sparse Learning
High dimensional sparse learning poses a great computational challenge for large-scale data analysis. In this paper, we investigate a broad class of sparse learning approaches formulated as linear programs parametrized by a {\em regularization factor}, and solve them by the parametric simplex method (PSM). PSM offers significant advantages over competing methods: (1) PSM naturally obtains the complete solution path for all values of the regularization parameter; (2) PSM provides a high-precision dual certificate stopping criterion; (3) PSM yields sparse solutions through very few iterations, and the solution sparsity significantly reduces the computational cost per iteration. In particular, we demonstrate the superiority of PSM over various sparse learning approaches, including the Dantzig selector for sparse linear regression, the sparse support vector machine for sparse linear classification, and sparse differential network estimation. We then provide sufficient conditions under which PSM always outputs sparse solutions, so that its computational performance can be significantly boosted. Thorough numerical experiments demonstrate the outstanding performance of PSM.
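For concreteness, the Dantzig selector from the list above is a linear program in exactly this parametrized form. The sketch below solves it at a single regularization value with SciPy's generic LP solver, rather than tracing the whole path by pivoting as PSM does; variable names are illustrative:

```python
import numpy as np
from scipy.optimize import linprog

def dantzig(X, y, lam):
    # min ||beta||_1  s.t.  ||X'(y - X beta)||_inf <= lam, with beta = u - v.
    n, p = X.shape
    G, b = X.T @ X, X.T @ y
    c = np.ones(2 * p)                   # objective: sum(u) + sum(v)
    A = np.vstack([np.hstack([G, -G]),   #  X'X beta <= X'y + lam
                   np.hstack([-G, G])])  # -X'X beta <= -X'y + lam
    ub = np.concatenate([b + lam, -b + lam])
    res = linprog(c, A_ub=A, b_ub=ub, bounds=(0, None))
    u, v = res.x[:p], res.x[p:]
    return u - v
```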
Variational Autoencoder for Deep Learning of Images, Labels and Captions
A novel variational autoencoder is developed to model images, as well as associated labels or captions. The Deep Generative Deconvolutional Network (DGDN) is used as a decoder of the latent image features, and a deep Convolutional Neural Network (CNN) is used as an image encoder; the CNN is used to approximate a distribution for the latent DGDN features/code. The latent code is also linked to generative models for labels (a Bayesian support vector machine) or captions (a recurrent neural network). When predicting a label/caption for a new image at test time, averaging is performed across the distribution of latent codes; this is computationally efficient as a consequence of the learned CNN-based encoder. Since the framework is capable of modeling the image in the presence or absence of associated labels/captions, a new semi-supervised setting emerges for CNN learning with images; the framework even allows unsupervised CNN learning, based on images alone.
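A minimal sketch of the test-time averaging step, assuming an `encoder` that returns a Gaussian posterior (mean, log-variance) and a `label_model` mapping a latent code to class logits; both names are placeholders, not the paper's code:

```python
import torch

def predict_label(encoder, label_model, x, n_samples=32):
    # One cheap encoder pass gives q(z|x); average the label predictor
    # over latent codes drawn from it (reparameterized sampling).
    mu, logvar = encoder(x)
    std = (0.5 * logvar).exp()
    logits = [label_model(mu + std * torch.randn_like(std))
              for _ in range(n_samples)]
    return torch.stack(logits).softmax(-1).mean(0)  # posterior-averaged
```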
Dual Decomposed Learning with Factorwise Oracle for Structural SVM of Large Output Domain
Many applications of machine learning involve structured outputs with large domains, where learning a structured predictor is prohibitive due to repetitive calls to an expensive inference oracle. In this work, we show that, by decomposing the training of a Structural Support Vector Machine (SVM) into a series of multiclass SVM problems connected through messages, one can replace the expensive structured oracle with a Factorwise Maximization Oracle (FMO) that allows an efficient implementation with complexity sublinear in the size of the factor domain. A Greedy Direction Method of Multiplier (GDMM) algorithm is proposed to exploit the sparsity of messages, and it guarantees $\epsilon$ sub-optimality after $O(\log(1/\epsilon))$ passes of FMO calls. We conduct experiments on chain-structured problems and fully-connected problems with large output domains. The proposed approach is orders of magnitude faster than state-of-the-art training algorithms for Structural SVMs.
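A toy contrast behind the claimed speedup, for a chain: the structured oracle is a full Viterbi pass over the sequence, while an FMO-style call maximizes a single factor's message-augmented score. Function and variable names are illustrative:

```python
import numpy as np

def viterbi_oracle(unary, pairwise):
    # Full structured oracle on a chain: O(T * K^2) per call.
    # unary: (T, K) node scores; pairwise: (K, K) transition scores.
    T, K = unary.shape
    score, back = unary[0].copy(), []
    for t in range(1, T):
        s = score[:, None] + pairwise + unary[t]  # (K, K) augmented scores
        back.append(s.argmax(0))                  # best predecessor per state
        score = s.max(0)
    path = [int(score.argmax())]
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))
    return path[::-1]

def factorwise_oracle(unary_t, msg_t):
    # FMO-style call: maximize one factor's message-augmented score, O(K).
    return int(np.argmax(unary_t + msg_t))
```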
Fast Uncertainty Quantification for Kernel-Based Estimators in Large-Scale Causal Inference
Kosko, Matthew, Bargagli-Stoffi, Falco J., Wang, Lin, Santacatterina, Michele
Kernel methods are widely used in causal inference for tasks such as treatment effect estimation, policy evaluation, and policy learning. The bootstrap is a standard tool for uncertainty quantification because of its broad applicability. As increasingly large datasets become available, such as the 2023 U.S. Natality data from the National Vital Statistics System (NVSS), which includes 3,596,017 registered births, the computational demands of these methods increase substantially. Kernel methods are known to scale poorly with sample size, and this limitation is further exacerbated by the repeated re-fitting required by the bootstrap. As a result, bootstrap-based inference for kernel-based estimators can become computationally infeasible in large-scale settings. In this paper, we address these challenges by extending the causal Bag of Little Bootstraps (cBLB) algorithm to kernel methods. Our approach achieves computational scalability by combining subsampling and resampling while preserving first-order uncertainty quantification and asymptotically correct coverage. We evaluate the method across three representative implementations: kernelized augmented outcome-weighted learning, kernel-based minimax weighting, and double machine learning with kernel support vector machines. We show in simulations that our method yields confidence intervals with nominal coverage at a fraction of the computational cost. We further demonstrate its utility in a real-world application by estimating the effect of any amount of smoking on birth weight, as well as the optimal treatment regime, using the NVSS dataset, where the standard bootstrap is computationally prohibitive and effectively infeasible at this scale.
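The core Bag of Little Bootstraps trick the paper builds on is easy to sketch: refit only on small subsets, but with multinomial weights that sum to the full sample size, so each refit mimics a full-size bootstrap replicate. The `estimate(subset, weights)` callable stands in for any weighted kernel estimator and is a placeholder:

```python
import numpy as np

def blb_ci(estimate, data, b=1000, n_subsets=10, n_boot=50, alpha=0.05):
    n = len(data)
    los, his = [], []
    for _ in range(n_subsets):
        sub = data[np.random.choice(n, b, replace=False)]  # b << n points
        reps = []
        for _ in range(n_boot):
            w = np.random.multinomial(n, np.ones(b) / b)   # weights sum to n
            reps.append(estimate(sub, w))   # weighted refit on b points only
        los.append(np.quantile(reps, alpha / 2))
        his.append(np.quantile(reps, 1 - alpha / 2))
    return np.mean(los), np.mean(his)       # average the subset intervals
```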
Learning Confidence Sets using Support Vector Machines
The goal of confidence-set learning in the binary classification setting is to construct two sets, each with a specific probability guarantee to cover a class. An observation outside the overlap of the two sets is deemed to be from one of the two classes, while the overlap is an ambiguity region which could belong to either class. Instead of plug-in approaches, we propose a support vector classifier to construct confidence sets in a flexible manner. Theoretically, we show that the proposed learner can control the non-coverage rates and minimize the ambiguity with high probability. Efficient algorithms are developed and numerical studies illustrate the effectiveness of the proposed method.
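To fix ideas, here is the plug-in thresholding baseline that the paper explicitly moves beyond (it learns the two boundaries within the SVM instead): calibrate one cutoff per class so each non-coverage rate is at most alpha, and report the overlap as the ambiguity region. Names are illustrative:

```python
import numpy as np

def confidence_sets(f_cal, y_cal, f_test, alpha=0.05):
    # f_cal: decision values on calibration data with labels y_cal in {-1, +1}.
    lo = np.quantile(f_cal[y_cal == +1], alpha)      # covers class +1
    hi = np.quantile(f_cal[y_cal == -1], 1 - alpha)  # covers class -1
    in_pos = f_test >= lo                            # set for class +1
    in_neg = f_test <= hi                            # set for class -1
    return in_pos, in_neg, in_pos & in_neg           # overlap = ambiguous
```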
Distributionally Robust Graphical Models
In many structured prediction problems, complex relationships between variables are compactly defined using graphical structures. The most prevalent graphical prediction methods---probabilistic graphical models and large margin methods---have their own distinct strengths but also possess significant drawbacks. Conditional random fields (CRFs) are Fisher consistent, but they do not permit integration of customized loss metrics into their learning process. Large-margin models, such as structured support vector machines (SSVMs), have the flexibility to incorporate customized loss metrics, but lack Fisher consistency guarantees. We present adversarial graphical models (AGM), a distributionally robust approach for constructing a predictor that performs robustly for a class of data distributions defined using a graphical structure. Our approach enjoys both the flexibility of incorporating customized loss metrics into its design as well as the statistical guarantee of Fisher consistency. We present exact learning and prediction algorithms for AGM with time complexity similar to existing graphical models and show the practical benefits of our approach with experiments.
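The "customized loss metric" hook the abstract contrasts is easiest to see in the SSVM objective, where the task loss enters loss-augmented inference directly; a brute-force sketch over a small output space, with all names illustrative:

```python
import numpy as np

def structured_hinge(score_fn, loss_fn, x, y_true, y_space):
    # max_y [ task_loss(y, y_true) + score(x, y) ] - score(x, y_true);
    # nonnegative because y_true itself is in y_space.
    aug = [loss_fn(y, y_true) + score_fn(x, y) for y in y_space]
    return max(aug) - score_fn(x, y_true)

# Any task loss plugs in directly, e.g. Hamming distance over label vectors.
hamming = lambda y, t: float(np.sum(np.array(y) != np.array(t)))
```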